lectures.alex.balgavy.eu

Lecture notes from university.
git clone git://git.alex.balgavy.eu/lectures.alex.balgavy.eu.git

Introduction: Data.html (4374B)


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><head><link rel="stylesheet" href="sitewide.css"/><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/><meta name="exporter-version" content="Evernote Mac 7.6 (457297)"/><meta name="altitude" content="-4.208069801330566"/><meta name="author" content="Alex Balgavy"/><meta name="created" content="2018-12-16 00:43:31 +0000"/><meta name="latitude" content="52.30035400390625"/><meta name="longitude" content="4.988170682800604"/><meta name="source" content="desktop.mac"/><meta name="updated" content="2018-12-16 01:27:21 +0000"/><title>Introduction: Data</title></head><body><h1>Introduction: Data</h1><div style="margin-top: 1em; margin-bottom: 1em;-en-paragraph:true;"><div>statistics: the science of data - collecting, organising, analysing, interpreting, presenting</div><div>
sample: a subcollection of members selected from the population</div></div><h2>Collecting sample data</h2><div style="margin-top: 1em; margin-bottom: 1em;-en-paragraph:true;">concepts:</div><ul><li><div>variables:
</div></li><ul><li><div>independent: might cause the effect being studied</div></li><li><div>dependent: represents the effect being studied</div></li></ul><li><div>confounding: when the effects of several variables are mixed together, so you cannot tell which one is causing the effect</div></li></ul><div style="margin-top: 1em; margin-bottom: 1em;-en-paragraph:true;">sampling methods:</div><ul><li><div>voluntary response: subjects decide to be included</div></li><li><div>random: each <span style="font-style: italic;">member</span> of the population has an equal probability of being selected</div></li><li><div>simple random: each <span style="font-style: italic;">sample of size n</span> has an equal probability of being selected</div></li><li><div>systematic: after a starting point, select every k-th member (based on a system)</div></li><li><div>convenience: choose what’s convenient</div></li><li><div>stratified: split the population into subgroups whose members share a characteristic, then take a simple random sample from each subgroup</div></li><li><div>cluster: split the population into clusters, randomly select some of them, then use all members of the selected clusters</div></li></ul><div style="margin-top: 1em; margin-bottom: 1em;-en-paragraph:true;">types of studies:</div><ul><li><div>observational study: subjects are observed, not modified
</div></li><ul><li><div>retrospective: data from the past</div></li><li><div>cross-sectional: data from one point in time</div></li><li><div>prospective: data to be collected (future)</div></li></ul><li><div>experiment: some treatment is applied to subjects
</div></li><ul><li><div>sometimes split into a control group and a treatment group</div></li><li><div>watch out for placebo and observer effects</div></li></ul></ul><h2>Types of data</h2><div style="margin-top: 1em; margin-bottom: 1em;-en-paragraph:true;">What to do with data?</div><ul><li><div>parameter: numerical measurement of a <span style="font-style: italic;">population</span> (denoted with Greek letters: 
<img src="Introduction%3A%20Data.resources/D876FCB8-9423-49FF-97D9-D75D8E9DCDAF.png" height="11" width="52"/>)</div></li><li><div>statistic: numerical measurement of a <span style="font-style: italic;">sample</span> (denoted with Latin letters: 
<img src="Introduction%3A%20Data.resources/0D52418B-F420-4868-AEB2-BF252B84BC51.png" height="13" width="50"/>)</div></li></ul><div style="margin-top: 1em; margin-bottom: 1em;-en-paragraph:true;">data can be:</div><ul><li><div>qualitative: names or labels (strings)</div></li><li><div>quantitative: numbers (ints, floats)
</div></li><ul><li><div>discrete: countable values</div></li><li><div>continuous: not countable; values lie on a continuous scale (like length, weight, distance)</div></li></ul></ul><div style="margin-top: 1em; margin-bottom: 1em;-en-paragraph:true;">there are different levels of measurement:</div><ul><li><div>qualitative:
</div></li><ul><li><div>nominal: no ordering (gender, eye color)</div></li><li><div>ordinal: ordering, but differences between categories have no meaning (e.g. agree/disagree)</div></li></ul><li><div>quantitative:
</div></li><ul><li><div>interval: ordering, meaningful differences, but no natural zero point (year of birth, temperatures in °F/°C)</div></li><li><div>ratio: ordering, meaningful differences, and a natural zero point (body height, marathon times)</div></li></ul></ul><div><br/></div></body></html>